Zone design for statistical disclosure control in administrative and linked microdata

Authors

  • David Martin
  • Chris Gale
Abstract

This work explores the application of automated zone design tools to the protection of record-level datasets that combine attribute detail with a large data volume, in a way that a data provider (e.g. a National Statistical Organisation or Health Service Provider) might implement, initially using a synthetic microdataset. Successful implementation could facilitate the release of rich linked record datasets to researchers in a form that preserves small area geographical associations without revealing actual locations. These associations are currently lost because data providers require a high level of geographical coding before release to researchers. Data perturbation is undesirable because researchers need detailed information on certain spatial attributes (e.g. distance to a medical practitioner, exposure to the local environment), a need that has driven demand for new linked administrative datasets, along with the provision of suitable research environments. The outcome is a bespoke aggregation of the microdata that meets a set of design constraints, but whose exact configuration is never revealed. Researchers are provided with detailed data and suitable geographies, with appropriately reduced disclosure risk.

Similar resources

Disclosure risk assessment in statistical microdata protection via advanced record linkage

The performance of Statistical Disclosure Control (SDC) methods for microdata (also called masking methods) is measured in terms of the utility and the disclosure risk associated with the protected microdata set. Empirical disclosure risk assessment based on record linkage stands out as a realistic and practical disclosure risk assessment methodology which is applicable to every conceivable maski...

Disclosure Control Methods and Information Loss for Microdata

Statistical disclosure control (SDC) seeks to modify statistical data so that they can be published without giving away confidential information that can be linked to specific respondents. The challenge for SDC is to achieve this modification with minimum loss of the detail and accuracy sought by database users. SDC methods for microdata are usually known as masking methods, of which there is a...

Anonymization of statistical data

In the modern digital society, personal information about individuals can be collected, stored, shared, and disseminated much more easily and freely. Such data can be released in macrodata form, reporting aggregated information, or in microdata form, reporting specific information on individual respondents. Protecting data against improper disclosure is then becoming critical to ensure proper pr...

Assessing the Statistical Disclosure Risk of a Demographic Microdata File

There are two recent developments related to survey data dissemination that may be increasing the risk of disclosure of respondent data. One is that statistical agencies are now releasing more microdata files than previously, partly in response to the urging of researchers needing the data for precise analytic work. For example, some data rich files with possibly high disclosure risk, that have...

On Recent Developments in Statistical Disclosure Control Techniques

Disclosure control of microdata sets is an important practical topic for statistical agencies. It is also a very interesting problem for theoretical statisticians from the viewpoint of statistical inference. In recent years some significant theoretical developments have been achieved by a group of Japanese researchers including the author. Here we summarize some of our results.

Publication date: 2017